Analyzing fraudulent emails with the MapReduce programming model to extract key linguistic patterns.
This project applies MapReduce to the Fraudulent E-Mail Corpus, a dataset of over 2,500 phishing emails. The goal is to identify the 20 most frequently used words and to evaluate how well those word frequencies reflect the phishing nature of the dataset.
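To make the approach concrete, here is a minimal single-process sketch of a MapReduce-style word count in Python. It is not the project's actual code: the sample emails, the tokenizing regex, and the function names (`map_words`, `shuffle`, `reduce_counts`, `top_words`) are all assumptions made for illustration, and a real run would read the corpus file and could distribute the map and reduce steps across workers.

```python
import re
from collections import defaultdict
from itertools import chain

# Hypothetical stand-in data; the real project reads the Fraudulent E-Mail Corpus.
SAMPLE_EMAILS = [
    "Dear friend, I need your urgent assistance to transfer the money.",
    "The bank will release the money once you send your account details.",
    "Urgent: your assistance is needed for this business transfer.",
]


def map_words(email):
    """Map step: emit a (word, 1) pair for every word in one email."""
    return [(word, 1) for word in re.findall(r"[a-z']+", email.lower())]


def shuffle(pairs):
    """Shuffle step: group the emitted counts by word."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def top_words(emails, n=20):
    """Run map, shuffle, and reduce, then return the n most frequent words."""
    mapped = chain.from_iterable(map_words(email) for email in emails)
    totals = reduce_counts(shuffle(mapped))
    # Sort by descending count, breaking ties alphabetically for stable output.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    for word, count in top_words(SAMPLE_EMAILS, n=5):
        print(word, count)
```

On this toy input the most frequent words are common function words such as "the" and "your", which suggests that some form of stop-word filtering may be needed before the top-20 list says much about phishing language. Whether the project applies such filtering is not stated here.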
The full code and documentation for this project are available on GitHub.